Quantization of Deep Neural Networks for Accurate Edge Computing
Authors
Abstract
Deep neural networks have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and plenty of existing works have explored quantization strategies aiming at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help improve accuracy. We conduct comprehensive experiments on three widely used applications: a fully connected network for biomedical image segmentation, a convolutional network for image classification on ImageNet, and a recurrent network for automatic speech recognition. Experimental results show accuracy improvements of 1%, 1.95%, and 4.23% on the three applications respectively, with 3.5x-6.4x memory reduction.
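The weight quantization the abstract refers to can be illustrated with a minimal sketch of symmetric, per-tensor uniform quantization. This is a generic example, not the paper's specific method; the function `quantize_weights` and its parameters are hypothetical names chosen for illustration.

```python
import numpy as np

def quantize_weights(w, n_bits=8):
    """Uniformly quantize a weight tensor to signed n_bits integers,
    then dequantize, simulating the rounding error of quantized inference."""
    levels = 2 ** (n_bits - 1) - 1        # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(w)) / levels    # per-tensor scale factor
    q = np.round(w / scale)               # integer grid in [-levels, levels]
    return (q * scale).astype(w.dtype)    # dequantized approximation of w

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
w_q = quantize_weights(w, n_bits=8)
# Rounding error per weight is bounded by half the step size (scale / 2).
err = np.max(np.abs(w - w_q))
```

Storing the integer grid values and a single float scale instead of full-precision floats is what yields the memory reductions of the kind the abstract reports (e.g. 32-bit floats down to 8 or fewer bits per weight).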
Similar Resources
Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization
Low-bit deep neural networks (DNNs) have become critical for embedded applications due to their low storage requirements and computing efficiency. However, they suffer much from a non-negligible accuracy drop. This paper proposes the stochastic quantization (SQ) algorithm for learning accurate low-bit DNNs. The motivation is due to the following observation. Existing training algorithms approximate...
Resiliency of Deep Neural Networks under Quantization
The complexity of deep neural network algorithms for hardware implementation can be much lowered by optimizing the word-length of weights and signals. Direct quantization of floating-point weights, however, does not show good performance when the number of bits assigned is small. Retraining of quantized networks has been developed to relieve this problem. In this work, the effects of quantizati...
Adaptive Quantization for Deep Neural Network
In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Deep model quantization can be used for reducing the computation and memory costs of DNNs...
Deep Neural Networks: Another Tool for Multimedia Computing
Over the years, the multimedia research community has leveraged many computational tools to advance its state of the art. Tools such as hidden Markov models (HMMs), support vector machines (SVMs), and particle filters have been used in multimedia content analysis, multimedia system design, and various multimedia applications. About eight years ago, another tool emerged: deep neural networks. D...
Journal
Journal title: ACM Journal on Emerging Technologies in Computing Systems
Year: 2021
ISSN: 1550-4832, 1550-4840
DOI: https://doi.org/10.1145/3451211